An Upper-Bound on Information Contained Within a Tweet

Author

  • Karl Koscher
Abstract

While tweets (and this paper) are limited to 140 characters, not all characters are created equal. This paper explores abuses of character encoding schemes to maximize the number of bits that can be conveyed by a tweet. In particular, since Twitter supports Unicode, we examine how we can abuse UTF8. For example, while people equate a Unicode codepoint with a character, some can be combined to form a single character. Does Twitter count these as one or two characters? Furthermore, some encodings (such as UTF8) allow more codepoints than are specified by Unicode – does Twitter accept these too? We ignore external links, embedded media, Twitter entities, and geotags, which are not universally supported.

BODY

Max bits/tweet? UTF8=31b/chr Can use chrs forbidden by RFC3629 Composing chars count as distinct codepts Max bits is 4339 (-1 for ctrl chrs)

REFERENCES

[1] UTF-8. http://en.wikipedia.org/wiki/UTF-8, July 2012.
[2] Twitter, Inc. Counting characters. https://dev.twitter.com/docs/counting-characters, Apr. 2012.
[3] F. Yergeau. UTF-8, a transformation format of ISO 10646. RFC 3629 (Standard), Nov. 2003.

Volume 1 of Tiny Transactions on Computer Science

This content is released under the Creative Commons Attribution-NonCommercial ShareAlike License. Permission to make digital or hard copies of all or part of this work is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. CC BY-NC-SA 3.0: http://creativecommons.org/licenses/by-nc-sa/3.0/.
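As a rough check on the 31 bits per character figure: before RFC 3629 restricted UTF-8 to four-byte sequences, the encoding scheme allowed sequences of up to six bytes, and a six-byte sequence carries 31 payload bits. The Python sketch below tabulates payload bits by sequence length and multiplies by the 140-character limit. The names `payload_bits` and `TWEET_CHARS` are hypothetical, and the sketch assumes every character is counted as one codepoint and encoded at maximum length; it does not model what Twitter actually accepts.

```python
# Sketch only: payload bits in the original UTF-8 scheme, which allowed
# sequences of up to 6 bytes (RFC 3629 later capped this at 4 bytes).

TWEET_CHARS = 140  # character limit stated in the abstract


def payload_bits(n_bytes: int) -> int:
    """Return the number of codepoint bits an n-byte sequence can carry."""
    if not 1 <= n_bytes <= 6:
        raise ValueError("sequence length must be between 1 and 6 bytes")
    if n_bytes == 1:
        return 7  # single byte: 0xxxxxxx
    # Lead byte: n one-bits, a zero bit, then (7 - n) payload bits.
    lead = 7 - n_bytes
    # Each continuation byte (10xxxxxx) carries 6 payload bits.
    return lead + 6 * (n_bytes - 1)


bits_by_length = {n: payload_bits(n) for n in range(1, 7)}
naive_upper_bound = TWEET_CHARS * payload_bits(6)

print(bits_by_length)
print(naive_upper_bound)
```

The product is 4340 bits. The body reports 4339, attributing the one-bit difference to control characters.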

Similar resources

An Upper Bound on the First Zagreb Index in Trees

In this paper we give sharp upper bounds on the Zagreb indices and characterize all trees achieving equality in these bounds. We also give a lower bound on the first Zagreb coindex of trees.

On trees attaining an upper bound on the total domination number

A total dominating set of a graph $G$ is a set $D$ of vertices of $G$ such that every vertex of $G$ has a neighbor in $D$. The total domination number of a graph $G$, denoted by $\gamma_t(G)$, is the minimum cardinality of a total dominating set of $G$. Chellali and Haynes [Total and paired-domination numbers of a tree, AKCE International Journal of Graphs and Combinatorics 1 (2004), 6...

An Upper Bound Approach for Analysis of Hydroforming of Sheet Metals

Considering a kinematically admissible velocity field, the upper bound method has been used for predicting the amount of pressure in hydroforming of sheet metals. The effects of work hardening, friction and blank size have been considered in pressure prediction. Also the effect of sheet thickness variation has been considered in the present work formulations. The relation between pressure and punch s...

A generalized upper bound solution for bimetallic rod extrusion through arbitrarily curved dies

In this paper, an upper bound approach is used to analyze the extrusion process of bimetallic rods through arbitrarily curved dies. Based on a spherical velocity field, internal, shearing and frictional power terms are calculated. The developed upper bound solution is used for calculating the extrusion force for two types of die shapes: a conical die as a linear die profile and a streamlined di...

An Upper Bound Analysis of Sandwich Sheet Rolling Process

In this research, the flat rolling process of bonded sandwich sheets is investigated by the upper bound method. A kinematically admissible velocity field is developed for a single layer sheet and is extended into the rolling of the symmetrical sandwich sheets. The internal, shear and frictional power terms are derived and they are used in the upper bound model. Through the analysis, the rolling ...

Journal title:
  • TinyToCS

Volume 1

Publication date: 2012